Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce Topics and bidirectional links between pages #639

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

kouloumos
Copy link
Contributor

@kouloumos kouloumos commented Mar 11, 2023

Overview

I see the PR Review Club as one of the best resources for context gathering. And this is mostly due to the notes that the individual hosts are putting together each week (meeting logs also, but I'll get back to these in a future idea). Through my experience participating and exploring past meetings I have also recognized the limitation described in #621. I have been thinking of ways to address those limitations, as well as some incremental changes that can potentially bring this resource closer to its full potential.

Current limitations

  • Repeating topics. As also pointed out in Add a Glossary or something similar #621, there are a lot of re-defintions of topics. Except of the additional burden that this adds to the host, building on top of existing definitions could potentially result to a better and more future-proof resource.
  • Topics get outdated. Last year's inner workings of coin selection might not be the same as today. Having different definitions might be confusing.
  • Underutilized knowledge base. While this accumulation of notes is gradually creating a comprehensive knowledge base, the current navigation options are limited to filtering by PR component (p2p, mempool, rpc, etc.)

Solution

This PR is an attempt to solve these limitations by introducing "Topics" as a type of glossary for review club meetings. Additionally it introduces a way to automatically create bidirectional links between pages (pr pages, topic pages) which provides another way to navigate through the pages, essentially transforming this knowledge base to a knowledge graph.

This fixes #621 and also paves the way for #429 , as relatedness can easily calculated based on the topics mentioned in each page.

Implementation

The first commit introduces the bidirectional linking logic. This is based on the digital-garden-jekyll-template and the back-links UI is based on how Roam Research displays back-links in each page.

comparison
(above img: snippets from the back-links of the coin selection page. Left: my own notes in Roam, Right: review club's /topics/coin-selection page after this PR)

This works with the support for double-bracket link syntax [[]] which creates a bidirectional link from the current page to the linking page. You can see the logic (with comments) in bidirectional_links_generator.rb, and I have also included linking examples in the second commit alongside a rake command for new topics page generation.

With the introduction of topics and the internal linking of pages it makes sense to have a way to differentiate between internal and external links. This is done in the third commit by adding a subtle symbol next to external links. There, I also took the opportunity to automatically modify all external links to open to a new tab (if I am the only one bother by that, it can easily be reverted).

(removed for now)The _ commit is just an interesting addition of a graph view, again taken from the digital-garden-jekyll-template, it shows another way that pages can be explored when this kind of bidirectional relationships are defined. At its current form is not ideal, that's why is not accessible from anywhere else but the /graph link. I can drop this last commit, I just wanted to highlight another way that pages can be explored when this kind of bidirectional relationships are defined. Also, looking at the graph, it's easier to conceptualize how this can be used to calculated relatedness of review club meetings for #429 .

removed graph view graph

The fourth commit adds anchors to each list item. This is an idea based on bitcoinops, that allows each text block to be directly referenceable from within backlinks.

ezgif com-video-to-gif

TODO

  • Apply this linking to past PR pages in order to create better references between them and also connect them with the topics that they are dealing with. My second commit is just an example of how this could be done (manually).

Personal Note

I agree with #621 (comment) regarding the fragmentation of information. I've been doing a bit of thinking about the current fragmentation of information and how we could find a way to close the gap and enable better context gathering for current and future contributors. This is definitely not dealing with the fragmentation but I don't believe it enhances it. I see this as a way to improve the current knowledge base while also adding the notion of internal linking that in the future can potentially evolve to cross-site linking and bring the different overlapping knowledge bases a bit closer.

Copy link
Contributor

@glozow glozow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is amazing @kouloumos, thank you so much for working on this! 🙌 🙌 🙌

The approach looks good to me. I really love that for the most part, future authors can link things by just adding [[]] around text.

Think of ways and if it makes sense to retroactively apply this linking to past PR pages in order to create better references between them and also connect them with the topics that they are dealing with.

Fwiw I think this is worth doing and happy to do it somewhat manually. Could we maybe just have a rake command to generate the topics page from a string (I'm sure I'd be able to come up with which terms)?

@kouloumos
Copy link
Contributor Author

🚀

Fwiw I think this is worth doing and happy to do it somewhat manually.

Nice! then I have some other questions about the approach.

  • The cool thing with this kind of linking is that you don't have to care about how to integrate a new block of information about a topic into the actual topic's page - you just [[]]link to it and someone who reads its page can automatically read all the related references. But I think that it still makes sense to have some definitions and insides in the actual page. How do we approach that? Do we choose what info to move inside the page and just delete those from the original? Or is better to not make significant edits to the original PR pages and just link to topics? That means that will be a clutter of somehow redundant back-links, see for example the current coin-selection page where I linked the definitions from a few different pages. Those could be merge into the topic's page and removed from the original pages.

  • Also, what do you think about notes that at the context of that review club were accurate but now they are not. For example #17331 talks about only 2 types of coin selection and explains a process of creating a transaction that might not be accurate in the future. When I revisited that PR at a later time, I was confused for a moment because other notes (that were written later) were different. If those explanations (coin selection, overview of creating a transaction) were under a topic then the reader could be sure about its uptodate status. But this might create the opposite issue, where someone reading through #17331 will follow the link to those topic and get confused because what they are looking at does not match the context of the discussions/pr/notes. That said, since all pages are under source control, it's easy to view the version of the page from the time of that review club. The only consideration is whether we need to hint to readers where to find that information, which depends on whether others see this as a potential issue, or if I'm simply overthinking it.

  • Should topic pages (including linking) be finalized as part of this PR? I think it could be a good idea as doing that we might observe issues with the implementation. On the other hand, going through all the PR pages needs time, so it might make sense to take a gradual approach of component by component basis.

Could we maybe just have a rake command to generate the topics page from a string

This makes sense as it would also make it easier for contributors to introduce topics in the future. I can add such a command as part of this PR.

@willcl-ark
Copy link
Contributor

Wow this is really cool!

@glozow
Copy link
Contributor

glozow commented Mar 13, 2023

But I think that it still makes sense to have some definitions and insides in the actual page. How do we approach that? Do we choose what info to move inside the page and just delete those from the original? Or is better to not make significant edits to the original PR pages and just link to topics?

I don't think we should delete anything from the original notes. It's fine to copy-paste definitions into the topic pages imo. Just collecting all the PRs about a topic is useful already - I think the current bullet point previews are really nice! It has most of the information right there (or 1 click away), and encourages clicking on those notes.

Also, what do you think about notes that at the context of that review club were accurate but now they are not.

I think this is a good argument for modifying the notes themselves. I don't think we should try to maintain a list of topics pages about things in code which are expected to change regularly. Things might get renamed or completely disappear, and that's fine. These are just logs and notes from meetings we had about PRs.

Should topic pages (including linking) be finalized as part of this PR? I think it could be a good idea as doing that we might observe issues with the implementation.

I don't think it's necessary to do all of the topics. I think the examples added are really nice, and later we can open followups to add more and more topics pages.

@kouloumos kouloumos force-pushed the topics-bidirectional-links branch 2 times, most recently from 3f91432 to bd2e924 Compare March 24, 2023 08:11
@kouloumos
Copy link
Contributor Author

The description of the PR has been updated with latest changes.

Gist of the updates:

  • backlinks' snippets now show the header of the section they were taken from.
  • new rake command for new topics' page generation.
  • Graph View menu option.
  • auto anchors based on bitcoinops, that allows each text block to be directly referenceable from within backlinks.

Copy link
Contributor

@jnewbery jnewbery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really cool. Great job @kouloumos !

I've left some specific feedback inline. A few general thoughts:

  • I think we should avoid writing long descriptions of topics that are widely relevant to technical users of Bitcoin, since those could just as easily be found on the Optech website or Bitcoin Stack Exchange. I think topics here should be concise definitions for topics that are specific to Bitcoin Core developers or Bitcoin protocol developers.

  • Perhaps we could add tooltips for the links to the topics page. If you hover over the link you get a tooltip popup that has a short (one sentence) description of the topic that's taken from the topic page's excerpt, and clicking on the link/tooltip takes you to the full topic page. Similar to wikipedia:
    image

  • The graph feature looks cool, but I don't think it's quite ready for merge yet. Perhaps pull that commit out into a separate WIP PR?

  • This greatly increases the time to build the site from scratch (on my machine it was 16.6s on master and 86.4s on this branch). Is there anything that can be done to improve that?

  • Do we choose what info to move inside the page and just delete those from the original? Or is better to not make significant edits to the original PR pages and just link to topics?

    I think we shouldn't make edits to old pages. It's too much effort to go back and change history.

  • Also, what do you think about notes that at the context of that review club were accurate but now they are not.

    Again, I don't think we should change history. It's an important skill to be able to determine the context of notes/PRs/mailing lists posts/etc. Each page has a published date, so it's easy enough to see what information is old and may be outdated.

  • Should topic pages (including linking) be finalized as part of this PR? I think it could be a good idea as doing that we might observe issues with the implementation.

    No - as long as we've got a few pages done that's enough to review the implementation. Adding all topics can be done later.

_includes/backlinks.html Outdated Show resolved Hide resolved
@@ -65,5 +65,6 @@ <h1>{{ page.title }} ({{components}})</h1>
<p>
{{ content }}
</p>
{% include backlinks.html %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I'd add this to the PR page. Perhaps wait until the "related meetings" feature is added and just link to related meetings?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here is that you can reference a PR at another PR's page and it will show the backlink in the same way that is currently displayed at the topics pages. In my mind, "related meetings" could be broader than just meetings that reference each other, therefore backlinks in PRs pages is just another way to navigate around.
Based on the recent changes, if no backlinks exist, nothing will be displayed at the PR page.

_layouts/topic.html Outdated Show resolved Hide resolved
# frozen_string_literal: true
# This file is based on code from https://github.com/bitcoinops/bitcoinops.github.io

# Automatically adds id tags (anchors) to list items
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about making the bullet points (slightly) larger so that they're more visible as clickable elements.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to keep the size the same as the regular bullets, but it's true that, in comparison to Optech's larger colored bullets, this doesn't make it obvious that they are clickable.
Just to point out, that clickable bullets are used in two places in the page:

  1. at each paragraph of the main page
  2. at each backlink snippet

I intentionally made (1) the same size as the previous(regular) bullets to not change the reader's experience and as I think that the clickability of the bullets in the main page doesn't offer so much value. I was also thinking of removing those, but as they are nonintrusive, I went with the extra utility for those that will notice.

For (2), it makes sense to make it a bit more obvious that they are clickable, so I just change them based on your suggestion.

Let me know if base on my reasoning you still think that (1) should also be bigger.

Comment on lines +13 to +15
# use the digest library to create deterministic ids based on text
id = Digest::MD5.hexdigest(slug_text)[0...7]
slug = "\#b#{id}" # prefix with 'b', ids cannot start with a number
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting (and different from the optech version). What's the rationale for using a digest here rather than the text?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optech has a very specific pattern of writing paragraphs that allows for the extraction of the title that is later used for the slug creation. This is not true here, therefore using just the text results to very long slugs, and if try to concatenate it (e.g. first 10 characters) does not always result to unique ids.

_posts/2021-04-28-#17331.md Outdated Show resolved Hide resolved
_posts/2021-04-28-#17331.md Outdated Show resolved Hide resolved
@@ -105,14 +102,14 @@ commit: 4ac1add
algorithm is not guaranteed to find a solution even when there are
sufficient funds since an _exact match_ may not exist.

- Using _effective values_ in coin selection means that we can freely add and
- Using [[effective value]]s in [[coin selection]] means that we can freely add and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that subsequent mentions of a topic word should link to the topic page. Text becomes very noisy/messy if every instance of a word is turned into a link.

_topics/cconnman.md Show resolved Hide resolved
_topics/coin-selection.md Outdated Show resolved Hide resolved
@kouloumos kouloumos force-pushed the topics-bidirectional-links branch 2 times, most recently from 65167e9 to 4d947f9 Compare March 28, 2023 14:47
@kouloumos
Copy link
Contributor Author

kouloumos commented Mar 28, 2023

Thank you for the review @jnewbery! I've addressed your comments inline.

Regarding your general thoughts:

  • tooltips: is something that I played with during this PR but it had issues therefore right now is a nice-to-have as a future addition.
  • graph: That was also my initial sentiment, that's why I didn't expose it to the public at first. But it's not a good implementation and if someone takes the time to change it, they'll probably rewrite a big chunk of the logic, therefore it makes sense to drop it. graph commit dropped.
  • build time: The new logic for bidirectional linking does a bit of calculations and that's where most of the time is spent. As this build happens once on each update I saw the slowdown as ok. That said, your comment pushed me to do some optimizations (build time on my machine from 99.8s to 25s):
    • there was unnecessary logic in open_external_links_in_new_tab.rb(now mark_external_links.rb) which has now been removed, resulting to ~60% faster build time on my machine.
    • bidirectional_links_generator.rb is now O(m*n) instead of O(n^2) where n=all_pages and m=pages_with_link_syntax, which results to another ~35% speedup on top of the previous one.

The description of the PR has been updated with latest changes.

Gist of the updates:

  • graph view removed
  • external links only marked as external and not automatically open to a new tab
  • topics have the option to be formatted as code
  • increased the size of bullets in backlink snippets to be more visible as clickable items
  • support for [[topic]](github-link) pattern for when we want to link to the current (at the time of the meeting) source code of a topic. It adds a github icon link next to the internal link. example:
    image

@glozow
Copy link
Contributor

glozow commented Apr 11, 2023

Needs rebase?

- add support for topics as a type of glossary for review club meetings
- add support for calculating and displaying back-links between pages
based on double-bracket link syntax
- add topics: coin-selection, effective-value, waste-metric, bnb, srd,
cconnman, peermanager
- use the new  double-bracket syntax to create
bidirectional links between pages
- add rake `posts:new` command for topic page generation
- each text block is now referenceable
- direct linking to each snippet from within backlinks
@kouloumos
Copy link
Contributor Author

Rebased.

Copy link

@Ronnie6464 Ronnie6464 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a Glossary or something similar
5 participants